Am29338 


Byte Queue 


ADVANCE INFORMATION 


DISTINCTIVE CHARACTERISTICS 


• Queuing/Dequeuing
  - Allows one to four bytes to be queued or dequeued in
    one cycle.
• Four 32x9 RAMs
  - Queues up to 128 bytes.
• Asynchronous/Synchronous Operation
  - Supports system communication with different rates
    or with different data sizes.
• Retransmit Capability
  - Allows re-dequeuing of the block data repeatedly.
• Horizontal Cascading
  - Allows simultaneous output of 1 to 16 bytes in
    synchronous operation.
• Parity Checking
  - Checks data transmission on the inputs and outputs.


GENERAL DESCRIPTION 


The Am29338 is a general-purpose byte queue that allows
up to four bytes to be queued and up to four bytes to be
dequeued in a single cycle. When four byte queues are
cascaded horizontally, up to sixteen bytes can be dequeued
in a single cycle.


With the retransmit capability, the part can repeatedly
resend the block data stored in the queue without having to
requeue it. This is useful for retransmitting a block of data
upon receipt of an error in I/O applications or for loop
locking in instruction-prefetch applications, for example.


Along with the above features, the byte queue operates in
synchronous or asynchronous mode. These features make
the part useful as an instruction-prefetch queue or as a
general-purpose FIFO buffer.


BLOCK DIAGRAM 


This document contains information on a product under development at Advanced Micro Devices,
Inc. The information is intended to help you to evaluate this product. AMD reserves
the right to change or discontinue work on this proposed product without notice.


RELATED AMD PRODUCTS


Am29334 Four-Port, Dual-Access Register File 


CONNECTION DIAGRAM
Bottom View


PIN DESIGNATIONS


METALLIZATION


ORDERING INFORMATION


Standard Products 


AMD standard products are available in several packages and operating ranges. The order number (Valid Combination) is formed by
a combination of:

A. Device Number
B. Speed Option (if applicable)
C. Package Type
D. Temperature Range
E. Optional Processing


AM29338 G


A. DEVICE NUMBER/DESCRIPTION
Am29338
Byte Queue


B. SPEED OPTION
Not Applicable


C. PACKAGE TYPE
G = 120-Terminal Pin Grid Array (CG 120)


D. TEMPERATURE RANGE
C = Commercial (0 to +85°C)


E. OPTIONAL PROCESSING
Blank = Standard processing
B = Burn-in


Valid Combinations


AM29338 GC, GCB


Valid Combinations 


Valid Combinations list configurations planned to be 
supported in volume for this device. Consult the local AMD 
sales office to confirm availability of specific valid 
combinations, to check on newly released combinations, and 
to obtain additional data on AMD's standard military grade 
products. 





PIN DESCRIPTION 


A_EMPTY  Almost Empty (Output; Active HIGH)
Indicates that there are fewer than four bytes of data in the
queue. It is used in either synchronous or asynchronous
operation.


A_FULL  Almost Full (Output; Active HIGH)
Indicates that there are fewer than four bytes of space
remaining. It is used in either synchronous or asynchronous
operation.


BDQ0-BDQ3  Bytes Dequeued (Input)
Selects the number of bytes to be dequeued (see Table 2).
The byte queue must operate synchronously to be able to
dequeue more than four bytes in a single cycle.


BQ0-BQ1  Bytes Queued (Input)
Selects the number of bytes to be queued (see Table 1).


BSW0-BSW1  Byte Swap (Input)
Allows the bytes on the output to be swapped (see Table 3).


CNT0-CNT6  Byte Count (Output)
Gives the current number of bytes in the queue. These are
used only in synchronous operation.


D0-D31  Data Input (Input)
Data inputs to be queued.


DQCLK  Dequeue Clock (Input)
Dequeues the number of bytes set up on the Y bus. A LOW-
to-HIGH transition on this input adjusts the internal dequeue
pointer by the number set up on the BDQ lines.


DQEN  Dequeue Enable (Input; Active LOW)
While DQEN is LOW, dequeuing is performed normally.
When DQEN is HIGH, DQCLK is disabled.


EMPTY  Empty (Output; Active HIGH)
Indicates that the queue is empty. It is used in either
synchronous or asynchronous operation.


FULL  Full (Output; Active HIGH)
Indicates that the queue is full. It is used in either
synchronous or asynchronous operation.


OE  Output Enable (Input; Active LOW)
When OE is LOW, the four bytes following the current
dequeue pointer and the corresponding parity bits are on the
Y and PY outputs. When OE is HIGH, the Y and PY outputs are
three-stated.


PD0-PD3  Data Input Parity (Input)
The input parity bits for the corresponding bytes on the D
inputs. Only the bytes to be queued and the corresponding
PD lines are checked for possible parity error. The byte
queue uses even parity.


PDERR  Data Input Parity Error (Output; Active HIGH)
If any of the bytes to be queued has a parity error, PDERR
is asserted.


POS0-POS1  Position (Input)
These inputs are used to program the location of each byte
queue in a horizontally cascaded system upon RESET (see
Table 4).


PY0-PY3  Output Data Parity (Output; Three State)
The output parity bits for the Y outputs. When OE is LOW, the
parity bits of the four bytes following the dequeue pointer
appear on these outputs. The byte queue uses even parity.


PYERR  Y Output Parity Error (Output; Active HIGH)
If any of the bytes on the output has a parity error, PYERR is
asserted.


QCLK  Queue Clock (Input)
When QCLK is LOW, the number of bytes set up on the BQ
lines is written into the next free space in the queue from
the data set up on the D inputs. On a LOW-to-HIGH
transition of this input, the internal queue pointer is updated.
If QEN is HIGH, QCLK has no effect.


QEN  Queue Enable (Input; Active LOW)
When QEN is LOW, queuing is performed normally. When
QEN is HIGH, QCLK is disabled.


RESET  Reset (Input; Active LOW)
When RESET is LOW, both the internal queue pointer and
the internal dequeue pointer are reset to the first RAM
location and both EMPTY and A_EMPTY are asserted.


RXMIT  Retransmit (Input; Active LOW)
When RXMIT is LOW, the internal dequeue pointer is reset
to the first RAM location while the internal queue pointer
remains unchanged. This allows the data contained
between the current queue pointer and the first RAM
location to be retransmitted.


Y0-Y31  Data Output (Output; Three State)
The four bytes following the current dequeue pointer appear
on these outputs when OE is LOW. When OE is HIGH, they
are three-stated.


FUNCTIONAL DESCRIPTION
Architecture


The Am29338 is a 32-bit high-performance byte queue that
stores up to 128 bytes in the internal RAM slices and queues
or dequeues up to four bytes in a single cycle. The byte queue
is divided into five functional blocks: 1) four memory-slice
logics, 2) byte rotators for input and output buses, 3) rotate-
enable logic, 4) byte-count logic, and 5) full/empty-generate
logic. Byte-oriented parity checking is provided on both
the D-input bus and the Y-output bus. Figure 1 shows a
detailed block diagram of the byte queue.


Memory-Slice Logic


Figure 2 shows a detail of the memory-slice logic. It consists of
a 32x9 RAM, queue and dequeue pointers, adders for the
pointers, and a full/empty detector. The RAM has independent
9-bit read and write ports. Both ports are accessible
simultaneously if different RAM locations are operated on. A
parity bit is stored along with its corresponding byte into the
RAM.
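

The even-parity convention described for the PD and PY lines can be
illustrated with a short sketch. The function names are hypothetical,
and the choice that "the bytes to be queued" are the first `count`
bytes on the D inputs is an assumption made for illustration; the data
sheet text does not state which byte lanes are selected.

```python
def even_parity_bit(byte: int) -> int:
    """Return the parity bit that makes the total number of 1s even."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must be in the range 0..255")
    # The parity bit equals the XOR of the eight data bits.
    return bin(byte).count("1") & 1


def parity_error(byte: int, parity: int) -> bool:
    """True when the supplied parity bit does not give even parity."""
    return even_parity_bit(byte) != (parity & 1)


def pderr(data, parity, count):
    """PDERR sketch: only the bytes being queued are checked.

    Assumption: the bytes being queued are the first `count` entries.
    """
    pairs = zip(data[:count], parity[:count])
    return any(parity_error(d, p) for d, p in pairs)
```

In this sketch a byte with an odd number of 1s carries a parity bit of
1, so the nine stored bits always contain an even number of 1s.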


The queue and dequeue pointers point to the next locations
available for queuing and dequeuing, respectively. The next
locations are produced by the internal adders from BQ0-1 or
BDQ0-3 and the current pointer values. When RESET is
asserted, both pointers are set to zero and the RAM is flushed.
These pointers are also used to indicate that the RAM is either
empty or full for each memory slice. The slice-empty or
slice-full signal is used to combinationally form the FULL,
A_FULL, EMPTY, and A_EMPTY signals.
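

The pointer and flag behavior above can be sketched as a small
behavioral model. This is an illustration under stated assumptions,
not the device's logic: the class name is hypothetical, consecutive
bytes are assumed to go to consecutive slices, the pointers are assumed
to wrap around, and a byte counter stands in for the per-slice
full/empty detectors.

```python
class ByteQueueModel:
    """Behavioral sketch of four 32x9 slices holding up to 128 bytes."""

    SLICES = 4
    DEPTH = 32
    CAPACITY = SLICES * DEPTH  # 128 bytes

    def __init__(self):
        self.reset()

    def reset(self):
        # RESET: both pointers return to the first RAM location.
        self.ram = [[0] * self.DEPTH for _ in range(self.SLICES)]
        self.q_ptr = 0
        self.dq_ptr = 0
        self.count = 0  # assumption: replaces the pointer comparison

    def _cell(self, ptr):
        # Assumed mapping: byte k lives in slice k % 4, location k // 4.
        return ptr % self.SLICES, ptr // self.SLICES

    def queue(self, data):
        n = len(data)
        if not 1 <= n <= 4:
            raise ValueError("one to four bytes per cycle")
        if n > self.CAPACITY - self.count:
            raise OverflowError("not enough space in the queue")
        for byte in data:
            s, loc = self._cell(self.q_ptr)
            self.ram[s][loc] = byte & 0xFF
            self.q_ptr = (self.q_ptr + 1) % self.CAPACITY
        self.count += n

    def peek(self):
        # Up to four bytes following the dequeue pointer (the Y outputs).
        out = []
        for i in range(min(4, self.count)):
            s, loc = self._cell((self.dq_ptr + i) % self.CAPACITY)
            out.append(self.ram[s][loc])
        return out

    def dequeue(self, n):
        if not 0 <= n <= min(4, self.count):
            raise ValueError("cannot dequeue that many bytes")
        out = self.peek()[:n]
        self.dq_ptr = (self.dq_ptr + n) % self.CAPACITY
        self.count -= n
        return out

    @property
    def empty(self):
        return self.count == 0

    @property
    def full(self):
        return self.count == self.CAPACITY

    @property
    def a_empty(self):
        return self.count < 4  # fewer than four bytes of data

    @property
    def a_full(self):
        return self.CAPACITY - self.count < 4  # fewer than four free
```

Queuing three bytes after reset leaves A_EMPTY asserted in this model,
because fewer than four bytes are stored, while EMPTY is released.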


Byte Rotator


There are two byte rotators in the byte queue. Each accepts
36-bit wide data and performs rotation of bytes according to
the 2-bit rotate values fed from the rotate-enable logic. The
input byte rotator realigns and stores the bytes to be queued
into the next free slice location. The output byte rotator
realigns the bytes to be dequeued to the least significant byte
of the Y-output bus.


Rotate-Enable Logic


The queue and dequeue rotate-enable logic keeps track of
which slice holds the first byte of the next queue/dequeue
operation. A modulo-4 counter is used to rotate the data in
operation and enables the correct slices by the number of
bytes specified by either BQ0-1 or BDQ0-3.


The queue rotate-enable logic also performs byte and/or word
swaps on the incoming data. The input bytes are swapped in
one of four ways, according to Table 3, with BSW0-1 and the
current modulo-4 byte count through the input byte rotator.


Byte-Count Logic


This logic consists of a queue count register and a dequeue
count register. The registers are incremented during a queue/
dequeue operation by the number of bytes in the operation.
The combinational subtract logic outside of these registers
determines the number of bytes stored in the byte queue.
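

The rotation and byte-count ideas above can be sketched as follows.
The function names, the lane-to-slice direction of rotation, and the
8-bit register width are assumptions chosen for illustration; the text
only states that a modulo-4 value steers the bytes and that the count
is formed by subtracting the two count registers.

```python
def rotate_to_slices(lanes, first_slice):
    """Input rotator sketch: lane i is steered to slice (first_slice + i) % 4."""
    slices = [None] * 4
    for i, byte in enumerate(lanes):
        slices[(first_slice + i) % 4] = byte
    return slices


def rotate_from_slices(slices, first_slice):
    """Output rotator sketch: the slice holding the first byte feeds lane 0."""
    return [slices[(first_slice + i) % 4] for i in range(4)]


def byte_count(queue_count, dequeue_count, width=8):
    """Stored bytes as the modular difference of the two count registers.

    Assumption: both registers are `width` bits wide and wrap around.
    """
    return (queue_count - dequeue_count) % (1 << width)
```

Because the subtraction is modular, the difference stays correct in
this sketch even after the queue count register has wrapped past zero.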
Figure 1. Am29338 Byte Queue Detailed Block Diagram


Figure 2. Memory and Slice Logic


Figure 3. Position Line Values in Horizontally Cascaded System


Key: L = LOW
H = HIGH


TABLE 2. SELECTING THE NUMBER OF BYTES TO BE DEQUEUED


Key: L = LOW
H = HIGH


* This is possible when four of the byte queues are cascaded together. The byte queue must be operated
synchronously to select more than four bytes for dequeuing.





TABLE 3. ENCODING OF BSW INPUTS 


Key: L = LOW
H = HIGH


Note: The assumption is made that the 32-bit data "A B C D" appears on the input bus.


TABLE 4. LOCATION IDENTIFICATION FOR HORIZONTAL CASCADING 


Key: L = LOW
H = HIGH


Note: "0" stands for the least significant chip and "3" the most significant chip.


Operational Modes


Both synchronous and asynchronous operations are available
for the byte queue.


Synchronous Mode


During synchronous operation, both QCLK and DQCLK must be
asserted on the edge of a common clock within certain skew
limits. The following signals can be used as valid status
outputs for this mode: FULL, A_FULL, EMPTY, A_EMPTY, and
CNT0-6. Refer to the applications section for an example.


Asynchronous Mode 


During asynchronous operation, QCLK and DQCLK clocks 
may be different. It is possible to execute queue and dequeue 
operations simultaneously if different locations are accessed. 
In this mode, CNT outputs are not guaranteed as valid and 
horizontal cascading is not possible. Refer to the applications 
section for an example. 


Horizontal Cascading 


In synchronous operation, four byte queues can be horizontally
cascaded together. In this case, each of the four byte
queues holds the same data and up to sixteen bytes may be
dequeued in a single cycle, as shown in Table 2, and Figures 3
and 4. Each part has to be programmed with its position by the
POS inputs, as shown in Table 4. In normal operation, the
internal dequeue pointer of each part is displaced according to
the POS inputs. When RESET or RXMIT is asserted, the
dequeue pointers are offset by the value programmed on the
POS inputs.


Horizontal cascading is useful in instruction buffers designed 
for systems with large, variable instructions that can span 
many bytes. 
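

The cascaded arrangement can be pictured with the following sketch.
It assumes that the chip programmed at position p presents the four
bytes starting 4 x p bytes past the common dequeue point, so that the
four parts together expose a 16-byte window; that offset rule and the
function names are illustrative assumptions, since Table 4 and
Figures 3 and 4 are not reproduced here.

```python
def cascaded_outputs(data, dq_ptr, chips=4):
    """Return the 4-byte window presented by each cascaded chip.

    Assumption: chip `pos` is displaced by 4 * pos bytes.
    """
    windows = []
    for pos in range(chips):
        start = dq_ptr + 4 * pos
        windows.append(data[start:start + 4])
    return windows


def cascaded_dequeue(data, dq_ptr, n):
    """Dequeue n bytes (0 to 16) across the cascade; return (bytes, new pointer)."""
    flat = [b for window in cascaded_outputs(data, dq_ptr) for b in window]
    if not 0 <= n <= len(flat):
        raise ValueError("cannot dequeue that many bytes")
    # Every chip holds the same data, so all pointers advance together.
    return flat[:n], dq_ptr + n
```

Since every part stores the same data, one dequeue of n bytes moves all
four pointers by n in this sketch and the 16-byte window slides as a unit.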


APPLICATIONS


Using Am29338 as an Instruction-Prefetch 
Queue 


Figure 5 shows the Am29338 used as an instruction-prefetch 
queue. Sequential 32-bit memory locations are fetched by the 
Instruction Fetch Unit (IFU) and are queued up in the byte 
queue. When the central processor needs the next instruction, 
it looks at the next four bytes from the byte queue. The central 
processor then determines the instruction length from the 
opcode and updates the dequeue pointer in the byte queue by 
setting up the instruction length on the BDQ lines and 
asserting DQCLK. When a jump occurs, the IFU flushes the 
queue by asserting the RESET input and begins from the new 
address. For this application, the byte queue must be in 
synchronous mode. 
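

The prefetch sequence above can be sketched in software. The opcode
values and lengths in `LENGTH_BY_OPCODE` are hypothetical, as are the
class and method names; a `collections.deque` stands in for the byte
queue, and the IFU is assumed to queue four bytes per fetch.

```python
from collections import deque

# Hypothetical opcode-to-length table, for illustration only.
LENGTH_BY_OPCODE = {0x01: 1, 0x02: 2, 0x03: 4}


class PrefetchQueue:
    CAPACITY = 128

    def __init__(self):
        self.buf = deque()

    def fetch_word(self, word_bytes):
        # The IFU queues one 32-bit memory word (four bytes).
        if len(word_bytes) != 4:
            raise ValueError("expected four bytes")
        if len(self.buf) + 4 > self.CAPACITY:
            raise OverflowError("queue is full")
        self.buf.extend(word_bytes)

    def next_instruction(self):
        # The processor looks at the next four bytes (the Y outputs).
        window = list(self.buf)[:4]
        if not window:
            return None
        length = LENGTH_BY_OPCODE[window[0]]
        if length > len(window):
            return None  # wait until the IFU has queued more bytes
        # Advance the dequeue pointer by the instruction length.
        for _ in range(length):
            self.buf.popleft()
        return window[:length]

    def jump(self):
        # A jump flushes the queue (RESET) before refilling.
        self.buf.clear()
```

Variable-length instructions are why the queue dequeues by a byte count
rather than by whole words: the window always starts at the next opcode.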


Using the RXMIT input, the byte queue can resend the block 
data through dequeuing rather than having to requeue it. This 
is useful for locking the loops into the byte queue and allows 
the processor to run faster than if it had to refetch instructions 
from memory or cache. Figure 6 illustrates how a loop can 
execute directly out of the byte queue. 
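

A minimal sketch of loop locking with the retransmit function follows.
It assumes the loop body fits in the 128-byte queue without pointer
wraparound; the class and method names are hypothetical.

```python
class RetransmitQueue:
    CAPACITY = 128

    def __init__(self):
        self.ram = []     # the queue pointer is len(self.ram)
        self.dq_ptr = 0   # dequeue pointer

    def queue(self, data):
        if len(self.ram) + len(data) > self.CAPACITY:
            raise OverflowError("block does not fit in the queue")
        self.ram.extend(data)

    def dequeue(self, n):
        n = min(n, len(self.ram) - self.dq_ptr)
        out = self.ram[self.dq_ptr:self.dq_ptr + n]
        self.dq_ptr += n
        return out

    def retransmit(self):
        # RXMIT: only the dequeue pointer returns to the first location.
        self.dq_ptr = 0

    def reset(self):
        # RESET: both pointers return to the first location.
        self.ram = []
        self.dq_ptr = 0
```

Each pass of the loop dequeues the same bytes again after `retransmit()`,
so no refetch from memory or cache is needed in this model.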


Using Am29338 as a Hardware Mailbox in 
Multiprocessing System 


A mailbox is a communication device between loosely coupled 
processes in a multi-programming system. Messages from 
one process to another are queued in the mailbox on a first-in, 
first-out (FIFO) basis. In a multiprocessing system, hardware 
mailboxes are required. This can be implemented using the 
Am29338 as shown in Figure 7. 


When a process wishes to send a message to the mailbox, it
calls a special operating-system routine. This routine first
reads the status of the mailbox; if it is not FULL, the routine
writes the message to the mailbox and returns to the
calling process. If the mailbox is FULL, the operating system
blocks the calling process on a special queue and enables
interrupts from the mailbox. When a slot becomes available in
the mailbox, the sending processor is interrupted. The
interrupt routine sends the message to the mailbox, disables
interrupts from the mailbox, and unblocks the blocked
process. On the receiving side, the EMPTY status of the mailbox
must be available to the receiving processor in order to allow
the receiving process to be blocked if the mailbox is empty.
When a mailbox slot becomes filled, a blocked process must
be awakened by interrupting the receiving processor.


The mailbox can be extended to operate in a heterogeneous
multiprocessing system. In this type of system, processors
with varying data-path widths and clock frequencies are
interconnected. For example, a 32-bit main processor may
control 8- to 16-bit coprocessors. The ability of the Am29338
to match data-path widths and to queue and dequeue
asynchronously allows processors of different widths and clock
rates to communicate.
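

The send-side protocol described for the mailbox can be sketched as
follows. The class, its slot-based capacity, and the `runnable` list
are hypothetical stand-ins for operating-system structures. One design
choice differs slightly from the text: this sketch keeps mailbox
interrupts enabled while other senders are still blocked.

```python
from collections import deque


class Mailbox:
    def __init__(self, slots):
        self.slots = slots
        self.messages = deque()
        self.blocked = deque()          # (process, message) pairs
        self.interrupts_enabled = False
        self.runnable = []              # processes that were unblocked

    @property
    def full(self):
        return len(self.messages) >= self.slots

    def send(self, process, message):
        # The routine first reads the mailbox status.
        if not self.full:
            self.messages.append(message)
            return True
        # FULL: block the caller and enable mailbox interrupts.
        self.blocked.append((process, message))
        self.interrupts_enabled = True
        return False

    def receive(self):
        if not self.messages:
            return None                 # EMPTY: the receiver would block
        message = self.messages.popleft()
        if self.interrupts_enabled:
            self._slot_available_interrupt()
        return message

    def _slot_available_interrupt(self):
        # Interrupt routine: send the pending message, unblock the sender.
        process, message = self.blocked.popleft()
        self.messages.append(message)
        self.runnable.append(process)
        if not self.blocked:
            self.interrupts_enabled = False
```

Messages still leave the mailbox in first-in, first-out order, including
those delivered later by the interrupt routine.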


Figure 5. Instruction-Prefetch Queue


Figure 6. Loop Locking Using Am29338


Notes: 1. The noise-generating TTL supply pins are connected directly to their respective power plane.
2. The noise-sensitive ECL supply pins are interconnected and decoupled as shown, then each
connected to the respective power plane in one point only.
3. The heavy lines indicate cuts in the two supply planes that achieve this isolation.


Figure 8. Am29338 Suggested Printed-Circuit-Board Layout 





ABSOLUTE MAXIMUM RATINGS


Storage Temperature ............................ -65 to +150°C
Case Temperature
with Power Applied ............................. -55 to +125°C
Supply Voltage
with Respect to Ground ......................... -0.5 to +7.0 V
DC Voltage Applied to Outputs
for HIGH State ................................. -0.5 V to +Vcc Max.
DC Input Voltage ............................... -0.5 V to +5.5 V


Stresses above those listed under ABSOLUTE MAXIMUM
RATINGS may cause permanent device failure. Functionality
at or above these limits is not implied. Exposure to absolute
maximum ratings for extended periods may affect device
reliability.


OPERATING RANGES


Commercial (C) Devices
Temperature (Tc) ............................... 0 to +85°C
                                                 (under 200 lfm)
Supply Voltage (Vcc) ........................... +4.75 to +5.25 V


Operating ranges define those limits between which the
functionality of the device is guaranteed.


Note: Recommended operating air velocity is 200 linear
feet per minute.


DC CHARACTERISTICS over operating ranges unless otherwise specified 


(Parameter table. Test conditions legible in the table include
Vcc = Min. with VIN = VIL or VIH, and Vcc = Max. with VO = 0.5 V.)


Notes: 1. For conditions shown as Min. or Max., use the appropriate value specified under Operating Ranges for the applicable device type.
2. Typical values are for Vcc = …, +25°C ambient and maximum loading.
3. Not more than one output should be shorted at a time. Duration of the short-circuit test should not exceed one second.
4. Measured with all inputs HIGH.


SWITCHING CHARACTERISTICS over operating ranges unless otherwise specified


Combinational Propagation Delays


SWITCHING TEST CIRCUITS


A. Three-State Outputs    B. Normal Outputs


Notes: 1. CL = 50 pF includes scope probe, wiring and stray capacitances without device in test fixture.
2. S1, S2, S3 are closed during function tests and all AC tests except output enable tests.
3. S1 and S3 are closed while S2 is open for tPZH test.
   S1 and S2 are closed while S3 is open for tPZL test.
4. CL = 5.0 pF for output disable tests.


SWITCHING WAVEFORMS
KEY TO SWITCHING WAVEFORMS


Notes: 1. Minimum time RESET must be asserted.
2. This timing diagram is applicable to RXMIT.


Notes on Test Methods


The following points give the general philosophy that we apply
to tests that must be properly engineered if they are to be
implemented in an automatic environment. The specifics of
what philosophies applied to which test are shown.


1. Ensure the part is adequately decoupled at the test head.
Large changes in supply current when the device switches
may cause function failures due to Vcc changes.


2. Do not leave inputs floating during any tests, as they may
oscillate at high frequency.


3. Do not attempt to perform threshold tests at high speed.
Following an input transition, ground current may change by
as much as 400 mA in 5-8 ns. Inductance in the ground
cable may allow the ground pin at the device to rise by
hundreds of millivolts momentarily.


4. Use extreme care in defining input levels for AC tests. Many
inputs may be changed at once, so there will be significant
noise at the device pins which may not actually reach VIL or
VIH until the noise has settled. AMD recommends using
VIL ≤ 0 V and VIH ≥ 3 V for AC tests.


5. To simplify failure analysis, programs should be designed to
perform DC, Function, and AC tests as three distinct groups
of tests.


6. To assist in testing, AMD offers complete documentation on
our test procedures and, in most cases, can provide actual
Sentry programs, under license from Sentry.


7. Capacitive Loading for A.C. Testing


Automatic testers and their associated hardware have stray
capacitance that varies from one type of tester to another,
but is generally around 50 pF. This, of course, makes it
impossible to make direct measurements of parameters
that call for a smaller capacitive load than the associated
stray capacitance. Typical examples of this are the so-
called "float delays" which measure the propagation
delays into and out of the high impedance state and are
usually specified at a load capacitance of 5.0 pF. In these
cases, the test is performed at the higher load capacitance
(typically 50 pF) and engineering correlations based on
data taken with a bench setup are used to predict the
result at the lower capacitance.


Similarly, a product may be specified at more than one
capacitive load. Since the typical automatic tester is not
capable of switching loads in mid-test, it is impossible to make
measurements at both capacitances even though they may
both be greater than the stray capacitance. In these cases, a
measurement is made at one of the two capacitances. The
result at the other capacitance is predicted from engineering
correlations based on data taken with a bench setup and the
knowledge that certain D.C. measurements (IOH, IOL, for
example) have already been taken and are within specification.
In some cases, special D.C. tests are performed in order
to facilitate this correlation.


8. Threshold Testing 


The noise associated with automatic testing, the long,
inductive cables, and the high gain of bipolar devices when
in the vicinity of the actual device threshold, frequently give
rise to oscillations when testing high-speed circuits.
These oscillations are not indicative of a reject device, but
instead, of an overtaxed test system. To minimize this
problem, thresholds are tested at least once for each input
pin. Thereafter, "hard" high and low levels are used for
other tests. Generally this means that function and A.C.
testing are performed at "hard" input levels rather than at
VIL max and VIH min.


9. A.C. Testing 


Occasionally, parameters are specified that cannot be
measured directly on automatic testers because of tester
limitations. Data input hold times often fall into this category.
In these cases, the parameter in question is guaranteed
by correlating these tests with other A.C. tests that have
been performed. These correlations are arrived at by the
cognizant engineer by using data from precise bench
measurements in conjunction with the knowledge that
certain D.C. parameters have already been measured and
are within specification.

In some cases, certain A.C. tests are redundant since they 
can be shown to be predicted by other tests that have 
already been performed. In these cases, the redundant 
tests are not performed. 